Data Donation Workshop - Notebook I:
Exporting and preprocessing WhatsApp Chat Logs

Author

Julian Kohne

Published

March 24, 2025

Setup

Setting some knitr options for better display in the Notebook

# options
knitr::opts_chunk$set(fig.align = "center")

Setting working directory

# Set the working directory to the directory of the current source file
if (rstudioapi::isAvailable()) {
  setwd(dirname(rstudioapi::getActiveDocumentContext()$path))
} else {
  message("Not running in RStudio; working directory not changed.")
}
Not running in RStudio; working directory not changed.

Installing and importing all necessary libraries

# Define the list of packages you need
pkg_list <- c("ggplot2", "rstudioapi","patchwork")  # add any packages you need

# Identify packages in the list that are not yet installed
new_packages <- pkg_list[!(pkg_list %in% installed.packages()[, "Package"])]

# Install any new packages
if (length(new_packages)) install.packages(new_packages)

# Load all packages from the list
invisible(lapply(pkg_list, library, character.only = TRUE))
library(WhatsR)




Exporting Data

You can export one of your own chat logs, use the ones provided in the data folder, or generate a dummy chat log for the workshop.

Exporting your own Chat log

The easiest option is to export your chat log file using the email option. This will send you either a plain .txt file or a zip file containing your chat log file. After downloading the file attachment from the email, you can unzip the file (if necessary) to get access to the .txt file with your chat log.

Using the provided Chat log(s)

You can find several pre-generated chat logs in the data folder. You can use these files for the workshop.

Generating a dummy chat log

You can also use WhatsR to generate your own dummy chat log file. Depending on the parameters you choose, these files can get quite large and may contain unusual combinations of features, but the default settings should be relatively safe.

# checking documentation
??create_chatlog

# generating a chat log file [prints to console and to a file]
WhatsR::create_chatlog(
  n_messages = 150,
  n_chatters = 2,
  n_emoji = 50,
  n_diff_emoji = 20,
  n_links = 20,
  n_locations = 5,
  n_smilies = 20,
  n_diff_smilies = 15,
  n_media = 10,
  media_excluded = TRUE,
  n_sdp = 3,
  n_deleted = 5,
  startdate = "01.01.2019",
  enddate = "31.12.2022",
  language = "german",
  time_format = "24h",
  os = "android",
  path = getwd(),
  chatname = "Simulated_WhatsR_chatlog"
  )




Chat Log File Structure

The chat log file is saved as a .txt file and can be opened with any text editor of your choice. Some good options are:

  1. Windows: Notepad++

  2. MacOS: TextMate

  3. Linux: Notepadqq

General Structure

In its most basic form, a chat log file consists of a list of messages with three parts each:

  1. Timestamp: Date and time when the message was sent

  2. Sender Name: The name or telephone number of the person who sent the message. If the person who exported the chat has not saved the sender in their phone contacts, this will display the sender's phone number in this format: +49 1234567890. If the sender is saved in the phone contacts, the corresponding name will be displayed.

  3. Message Body: The message that was sent. Includes emoji, smileys, links, locations, etc. For voice or video calls, an indicator is present in the text file.

Example Structure for: Android, German, 24h time format:


01.01.16, 03:14 - Janae: Duis cum dolor congue posuere dignissim proin :D :D
01.01.16, 04:38 - Kelsee: Commodo ullamcorper scelerisque ligula?  🧗🏽‍♂️ 👉
08.01.16, 10:49 - Janae: <Videonachricht weggelassen>
15.01.16, 19:18 - Kelsee: <Medien ausgeschlossen>
13.02.16, 10:48 - Janae: Erat?!
🏃🏽 🤸🏾‍♀
14.02.16, 09:49 - Janae: Ultricies in pellentesque
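The three-part structure can be illustrated with a few lines of base R. This is only a minimal sketch for the Android, German, 24h layout shown above, not the way WhatsR parses chats internally. The object names (raw, head_pattern, toy_parsed) are made up for this example, and the pattern assumes that sender names contain no colon. Lines that do not start with a timestamp are treated as continuations of the previous message.

```r
# Toy input in the Android / German / 24h layout (shortened example lines)
raw <- c(
  "01.01.16, 03:14 - Janae: Duis cum dolor :D :D",
  "08.01.16, 10:49 - Janae: <Videonachricht weggelassen>",
  "13.02.16, 10:48 - Janae: Erat?!",
  "(second line of the same message)",
  "14.02.16, 09:49 - Kelsee: Ultricies in pellentesque"
)

# A new message starts with "dd.mm.yy, hh:mm - "
ts_start <- "^[0-9]{2}\\.[0-9]{2}\\.[0-9]{2}, [0-9]{2}:[0-9]{2} - "
starts <- grepl(ts_start, raw)

# Glue continuation lines onto the message they belong to
merged <- unname(vapply(split(raw, cumsum(starts)), paste,
                        character(1), collapse = "\n"))

# Capture timestamp and sender; the message is everything after the header
head_pattern <- "^([0-9]{2}\\.[0-9]{2}\\.[0-9]{2}, [0-9]{2}:[0-9]{2}) - ([^:]+): "
parts <- regmatches(merged, regexec(head_pattern, merged))
header <- vapply(parts, `[`, character(1), 1)

toy_parsed <- data.frame(
  DateTime = as.POSIXct(vapply(parts, `[`, character(1), 2),
                        format = "%d.%m.%y, %H:%M", tz = "UTC"),
  Sender   = vapply(parts, `[`, character(1), 3),
  Message  = substring(merged, nchar(header) + 1),
  stringsAsFactors = FALSE
)
toy_parsed
```

The sketch already breaks as soon as the timestamp layout, the language, or the operating system changes, which is exactly the problem discussed in the following sections.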
  

Unfortunately, even though the chat log file looks very well structured at first, there are many factors influencing the exact layout of the file. This can make it difficult to process multiple files coming from different people using different phones and having different settings. The following factors all influence the layout of the chat log file:

  • Whether the file was exported with or without media files
  • Whether the file was exported from an Android or iOS phone
  • The language setting of the phone the file was exported from
  • The time format setting of the phone the file was exported from

Let’s go through these one by one and check the differences!

With media and without media

As mentioned before, the chat log file can be exported with or without media files. If the file is exported with media files, the chat log file will contain a line containing a media file indicator string and the name of the file. If the file is exported without media files, the chat log file will contain a line indicating that a media file was excluded.

Example Structure for: Android, German, 24h time format (exported with media files):

06.04.19, 09:14 - Frank: Schickst du das Foto von gestern mal? :D :D
06.04.19, 09:38 - Bob: KLar! Moment...
06.04.19, 11:18 - Bob: ‎IMG-20231223-WA0000.jpg (Datei angehängt)
06.04.19, 11:48 - Frank: Und das Video?
06.04.19, 11:49 - Bob: Kommt sofort!
06.04.19, 11:49 - Frank: ‎VID-20231227-WA0002.mp4 (Datei angehängt)
  

Example Structure for: Android, German, 24h time format (exported without media files):

06.04.19, 09:14 - Frank: Schickst du das Foto von gestern mal? :D :D
06.04.19, 09:38 - Bob: KLar! Moment...
06.04.19, 11:18 - Bob: <Medien ausgeschlossen>
06.04.19, 11:48 - Frank: Und das Video?
06.04.19, 11:49 - Bob: Kommt sofort!
06.04.19, 11:49 - Frank: <Medien ausgeschlossen>
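As a rough illustration, the two export variants can be told apart by searching for the indicator strings shown in the examples. The helper name classify_media_export is hypothetical, and the sketch only knows the two German Android strings from above ("\u00e4" is the escaped form of "ä").

```r
# Hypothetical helper: guess whether a German Android export includes media
classify_media_export <- function(lines) {
  has_attached <- any(grepl("(Datei angeh\u00e4ngt)", lines, fixed = TRUE))
  has_omitted  <- any(grepl("<Medien ausgeschlossen>", lines, fixed = TRUE))
  if (has_attached) {
    "with media"
  } else if (has_omitted) {
    "without media"
  } else {
    "no media messages found"
  }
}

with_media <- c(
  "06.04.19, 09:38 - Bob: KLar! Moment...",
  "06.04.19, 11:18 - Bob: IMG-20231223-WA0000.jpg (Datei angeh\u00e4ngt)"
)
without_media <- c(
  "06.04.19, 09:38 - Bob: KLar! Moment...",
  "06.04.19, 11:18 - Bob: <Medien ausgeschlossen>"
)

classify_media_export(with_media)
classify_media_export(without_media)
```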
  

User vs. System generated messages

WhatsApp adds some information into chat logs as system messages. These are problematic for parsing because they do have a timestamp but not a sender.

Example Structure for: Android, German, 24h time format:

27.01.19, 18:51 - Nachrichten und Anrufe sind Ende-zu-Ende-verschlüsselt. Niemand außerhalb dieses Chats kann sie lesen oder anhören, nicht einmal WhatsApp. Tippe, um mehr zu erfahren.
06.04.19, 09:14 - Frank: Schickst du das Foto von gestern mal? :D :D
06.04.19, 09:38 - Bob: KLar! Moment...
06.04.19, 11:18 - Bob: <Medien ausgeschlossen>
06.04.19, 11:48 - Frank: Und das Video?
06.04.19, 11:49 - Bob: Kommt sofort!
06.04.19, 11:49 - Frank: <Medien ausgeschlossen>
06.04.19, 11:49 - Frank hat eine neue Telefonnummer. Tippe, um eine Nachricht zu schreiben oder die neue Nummer hinzuzufügen.
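One simple heuristic for spotting such lines is to look for a timestamp that is not followed by a "Sender: " part. The sketch below is an assumption-laden illustration rather than the approach WhatsR takes: it would misclassify any system message that happens to contain a colon. The object names are made up for this example ("\u00fc" is the escaped form of "ü").

```r
chat <- c(
  "06.04.19, 09:14 - Frank: Schickst du das Foto von gestern mal? :D :D",
  "06.04.19, 09:38 - Bob: KLar! Moment...",
  "06.04.19, 11:49 - Frank hat eine neue Telefonnummer. Tippe, um eine Nachricht zu schreiben oder die neue Nummer hinzuzuf\u00fcgen."
)

# Timestamp prefix for the Android / 24h layout
ts_prefix <- "^[0-9]{2}\\.[0-9]{2}\\.[0-9]{2}, [0-9]{2}:[0-9]{2} - "

has_timestamp <- grepl(ts_prefix, chat)
# User messages continue with "Sender: " directly after the prefix
has_sender <- grepl(paste0(ts_prefix, "[^:]+: "), chat)

# Heuristic: timestamp present, but no sender
is_system <- has_timestamp & !has_sender
is_system
```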
  

Language

The exact strings of media file indicators and system messages differ depending on the language setting of the exporting phone. This makes it more difficult to parse chat logs exported from phones with different language settings.

Tip

The language setting of the exporting phone only affects the messages inserted by WhatsApp into the chat. The actual message content always stays in the language the participants wrote in. This can result in a chat where the system messages and media indicators are in a different language than the actual chat messages!

Example Structure for: Android, German, 24h time format:

27.01.19, 18:51 - Nachrichten und Anrufe sind Ende-zu-Ende-verschlüsselt. Niemand außerhalb dieses Chats kann sie lesen oder anhören, nicht einmal WhatsApp. Tippe, um mehr zu erfahren.
06.04.19, 09:14 - Frank: Schickst du das Foto von gestern mal? :D :D
06.04.19, 09:38 - Bob: KLar! Moment...
06.04.19, 11:18 - Bob: <Medien ausgeschlossen>
06.04.19, 11:48 - Frank: Und das Video?
06.04.19, 11:49 - Bob: Kommt sofort!
06.04.19, 11:49 - Frank: <Medien ausgeschlossen>
06.04.19, 11:49 - Frank hat eine neue Telefonnummer. Tippe, um eine Nachricht zu schreiben oder die neue Nummer hinzuzufügen.
  

Example Structure for: Android, English, 24h time format:

27.01.19, 18:51 - Messages and calls are end-to-end encrypted. No one outside of this chat, not even WhatsApp, can read or listen to them.
06.04.19, 09:14 - Frank: Schickst du das Foto von gestern mal? :D :D
06.04.19, 09:38 - Bob: KLar! Moment...
06.04.19, 11:18 - Bob: <media omitted>
06.04.19, 11:48 - Frank: Und das Video?
06.04.19, 11:49 - Bob: Kommt sofort!
06.04.19, 11:49 - Frank: <media omitted>
06.04.19, 11:49 - Frank changed their phone number to a new number. Tap to message or add the new number.
  

Media file indicators

Example Structure for: Android, German, 24h time format:

27.01.19, 18:51 - Nachrichten und Anrufe sind Ende-zu-Ende-verschlüsselt. Niemand außerhalb dieses Chats kann sie lesen oder anhören, nicht einmal WhatsApp. Tippe, um mehr zu erfahren.
06.04.19, 09:14 - Frank: Schickst du das Foto von gestern mal? :D :D
06.04.19, 09:38 - Bob: KLar! Moment...
06.04.19, 11:18 - Bob: <Medien ausgeschlossen>
06.04.19, 11:48 - Frank: Und das Video?
06.04.19, 11:49 - Bob: Kommt sofort!
06.04.19, 11:49 - Frank: <Medien ausgeschlossen>
06.04.19, 11:49 - Frank hat eine neue Telefonnummer. Tippe, um eine Nachricht zu schreiben oder die neue Nummer hinzuzufügen.
  

Example Structure for: Android, English, 24h time format:

27.01.19, 18:51 - Messages and calls are end-to-end encrypted. No one outside of this chat, not even WhatsApp, can read or listen to them.
06.04.19, 09:14 - Frank: Schickst du das Foto von gestern mal? :D :D
06.04.19, 09:38 - Bob: KLar! Moment...
06.04.19, 11:18 - Bob: <media omitted>
06.04.19, 11:48 - Frank: Und das Video?
06.04.19, 11:49 - Bob: Kommt sofort!
06.04.19, 11:49 - Frank: <media omitted>
06.04.19, 11:49 - Frank changed their phone number to a new number. Tap to message or add the new number.
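Because the indicator strings are language-specific, they can be used to guess the language setting of the exporting phone. The following sketch counts hits against a small lookup table that contains only strings taken from the examples above. The names indicators and guess_language are hypothetical, and a real lookup table would need many more entries per language.

```r
# Lookup table with indicator strings taken from the examples above
indicators <- list(
  german  = c("<Medien ausgeschlossen>", "hat eine neue Telefonnummer"),
  english = c("<media omitted>", "changed their phone number")
)

# Hypothetical helper: pick the language with the most indicator hits
guess_language <- function(lines, indicators) {
  hits <- vapply(indicators, function(patterns) {
    sum(vapply(patterns,
               function(p) sum(grepl(p, lines, fixed = TRUE)),
               integer(1)))
  }, integer(1))
  if (all(hits == 0)) {
    return(NA_character_)
  }
  names(hits)[which.max(hits)]
}

english_export <- c(
  "06.04.19, 09:38 - Bob: KLar! Moment...",
  "06.04.19, 11:18 - Bob: <media omitted>",
  "06.04.19, 11:49 - Frank changed their phone number to a new number. Tap to message or add the new number."
)
guess_language(english_export, indicators)
```

Note that the message text in this example is German while the detected phone language is English, which is the situation described in the tip above.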
  

AM/PM vs. 24h format

The timestamps in the exported chat logs depend on the date and time settings of the exporting phone. The most common formats are the 24h format and the AM/PM format. The 24h format is more common in Europe, while the AM/PM format is more common in the US.

Example Structure for: Android, German, 24h time format:

27.01.19, 18:51 - Nachrichten und Anrufe sind Ende-zu-Ende-verschlüsselt. Niemand außerhalb dieses Chats kann sie lesen oder anhören, nicht einmal WhatsApp. Tippe, um mehr zu erfahren.
06.04.19, 09:14 - Frank: Schickst du das Foto von gestern mal? :D :D
06.04.19, 09:38 - Bob: KLar! Moment...
06.04.19, 11:18 - Bob: <Medien ausgeschlossen>
06.04.19, 11:48 - Frank: Und das Video?
06.04.19, 11:49 - Bob: Kommt sofort!
06.04.19, 11:49 - Frank: <Medien ausgeschlossen>
06.04.19, 11:49 - Frank hat eine neue Telefonnummer. Tippe, um eine Nachricht zu schreiben oder die neue Nummer hinzuzufügen.
  

Example Structure for: Android, German, am/pm time format:

27.01.19, 06:51 PM - Nachrichten und Anrufe sind Ende-zu-Ende-verschlüsselt. Niemand außerhalb dieses Chats kann sie lesen oder anhören, nicht einmal WhatsApp. Tippe, um mehr zu erfahren.
06.04.19, 09:14 AM - Frank: Schickst du das Foto von gestern mal? :D :D
06.04.19, 09:38 AM - Bob: KLar! Moment...
06.04.19, 11:18 AM - Bob: <Medien ausgeschlossen>
06.04.19, 11:48 AM - Frank: Und das Video?
06.04.19, 11:49 AM - Bob: Kommt sofort!
06.04.19, 11:49 AM - Frank: <Medien ausgeschlossen>
06.04.19, 11:49 AM - Frank hat eine neue Telefonnummer. Tippe, um eine Nachricht zu schreiben oder die neue Nummer hinzuzufügen.
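To compare chats across both formats, AM/PM timestamps have to be converted to a common representation. The sketch below does the conversion by hand instead of relying on the %p format code, whose behavior depends on the locale. The function name parse_ampm is made up for this example, and it assumes the "dd.mm.yy, hh:mm AM/PM" layout shown above.

```r
# Hypothetical helper: convert "dd.mm.yy, hh:mm AM/PM" to POSIXct (UTC)
parse_ampm <- function(ts) {
  pattern <- "^([0-9]{2}\\.[0-9]{2}\\.[0-9]{2}), ([0-9]{1,2}):([0-9]{2}) (AM|PM)$"
  m <- regmatches(ts, regexec(pattern, ts))
  date   <- vapply(m, `[`, character(1), 2)
  hour   <- as.integer(vapply(m, `[`, character(1), 3))
  minute <- vapply(m, `[`, character(1), 4)
  ampm   <- vapply(m, `[`, character(1), 5)
  # 12 AM is midnight (00), 12 PM is noon (12)
  hour24 <- hour %% 12L + ifelse(ampm == "PM", 12L, 0L)
  as.POSIXct(sprintf("%s, %02d:%s", date, hour24, minute),
             format = "%d.%m.%y, %H:%M", tz = "UTC")
}

format(parse_ampm(c("27.01.19, 06:51 PM", "06.04.19, 09:14 AM")), "%H:%M")
```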
  

Android vs. iOS

Unfortunately, chats exported from Android and iOS phones have different structures with respect to multiple features. The following are some of the differences:

Timestamps

For both operating systems, the timestamp format depends mainly on the date and time setting of the exporting phone. However, even with the same setting, iOS and Android can format timestamps differently. For example, the following timestamps are both in 24h format (in the phone settings) but still differ between iOS and Android:

Example Structure for: Android, English, 24h time format:

06.04.19, 09:14 - Frank: Hi Bob, how are you doing?
06.04.19, 09:38 - Bob: Fine, how about you?
06.04.19, 11:18 - Bob: Did you see the game yesterday?
06.04.19, 11:48 - Frank: Yes!! What a great overtime finish!
06.04.19, 11:49 - Bob: For real! I was so nervous!
06.04.19, 10:49 - Frank: Wanna go to the stadium next time?
  

Example Structure for: iOS, English, 24h time format:

06.04.19, 09:14:23 - Frank: Hi Bob, how are you doing?
06.04.19, 09:38:16 - Bob: Fine, how about you?
06.04.19, 11:18:01 - Bob: Did you see the game yesterday?
06.04.19, 11:48:56 - Frank: Yes!! What a great overtime finish!
06.04.19, 11:49:33 - Bob: For real! I was so nervous!
06.04.19, 10:49:12 - Frank: Wanna go to the stadium next time?
  

As you can see, chats exported from iPhones in 24h format have seconds in the timestamp, while chats exported from Android phones do not. This is a small difference but can cause problems when parsing the data and also represents a different level of time resolution.
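A small sketch of how both variants could be read into the same date-time class: the format string is chosen per timestamp, depending on whether seconds are present. The function name parse_24h is hypothetical. Android timestamps end up with seconds set to zero, so the difference in time resolution remains and has to be kept in mind during analysis.

```r
# Hypothetical helper: parse 24h timestamps with or without seconds
parse_24h <- function(ts) {
  has_seconds <- grepl(
    "^[0-9]{2}\\.[0-9]{2}\\.[0-9]{2}, [0-9]{2}:[0-9]{2}:[0-9]{2}$", ts
  )
  fmt <- ifelse(has_seconds, "%d.%m.%y, %H:%M:%S", "%d.%m.%y, %H:%M")
  as.POSIXct(strptime(ts, format = fmt, tz = "UTC"))
}

stamps <- c("06.04.19, 09:14", "06.04.19, 09:14:23")
format(parse_24h(stamps), "%Y-%m-%d %H:%M:%S")
```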

Media file indicators

For Android and iOS, the indicator strings for media files are different. This is true both for exports with media files and for exports without media files.

With media files

Example Structure for: Android, German, 24h time format:

06.04.19, 09:14 - Frank: Schickst du das Foto von gestern mal? :D :D
06.04.19, 09:38 - Bob: KLar! Moment...
06.04.19, 11:18 - Bob: ‎IMG-20231223-WA0000.jpg (Datei angehängt)
06.04.19, 11:48 - Frank: Und das Video?
06.04.19, 11:49 - Bob: Kommt sofort!
06.04.19, 10:49 - Frank: ‎VID-20231227-WA0002.mp4 (Datei angehängt)
  

Example Structure for: iOS, German, 24h time format:

06.04.19, 09:14:23 - Frank: Schickst du das Foto von gestern mal? :D :D
06.04.19, 09:38:16 - Bob: KLar! Moment...
06.04.19, 11:18:01 - Bob:<Anhang: ‎IMG-20231223-WA0000.jpg>
06.04.19, 11:48:56 - Frank: Und das Video?
06.04.19, 11:49:33 - Bob: Kommt sofort!
06.04.19, 10:49:12 - Frank:<Anhang: ‎VID-20231227-WA0002.mp4>
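One way to sidestep the different indicator strings is to match the file name itself. The sketch below assumes that attached files follow the naming scheme visible in the examples (three capital letters, an eight-digit date, "WA" plus four digits, and a file extension). Other naming schemes may exist and would not be caught. The function name extract_media_file is hypothetical.

```r
# Hypothetical helper: pull the media file name out of a message body
extract_media_file <- function(message) {
  pattern <- "[A-Z]{3}-[0-9]{8}-WA[0-9]{4}\\.[A-Za-z0-9]+"
  m <- regexpr(pattern, message)
  out <- rep(NA_character_, length(message))
  out[m != -1] <- regmatches(message, m)
  out
}

bodies <- c(
  "IMG-20231223-WA0000.jpg (Datei angeh\u00e4ngt)",  # Android style
  "<Anhang: VID-20231227-WA0002.mp4>",               # iOS style
  "Und das Video?"                                   # no media file
)
extract_media_file(bodies)
```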
  

Without media files

Example Structure for: Android, German, 24h time format:

06.04.19, 09:14 - Frank: Schickst du das Foto von gestern mal? :D :D
06.04.19, 09:38 - Bob: KLar! Moment...
06.04.19, 11:18 - Bob: <Medien ausgeschlossen>
06.04.19, 11:48 - Frank: Und das Video?
06.04.19, 11:49 - Bob: Kommt sofort!
06.04.19, 10:49 - Frank: <Medien ausgeschlossen>
  

Example Structure for: iOS, German, 24h time format:

06.04.19, 09:14:23 - Frank: Schickst du das Foto von gestern mal? :D :D
06.04.19, 09:38:16 - Bob: KLar! Moment...
06.04.19, 11:18:01 - Bob:Bild weggelassen
06.04.19, 11:48:56 - Frank: Und das Video?
06.04.19, 11:49:33 - Bob: Kommt sofort!
06.04.19, 10:49:12 - Frank:Video weggelassen
  




Parsing Chat log file(s)

To effectively parse a WhatsApp chat log, we need to know:

  • the operating system of the device that exported the chat log

  • the language setting of the exporting phone

  • the date-time format of the exporting phone

  • and whether the file was exported including media files or not.

Because WhatsApp data donations do not provide us with this meta-information directly, we need to infer it automatically from the structure of the chat logs and then parse the data accordingly.

Below, you can find a schematic process for parsing a chat log file based on different features.

 flowchart TD
    A1[Read Chat File] --> A2[Detect phone OS based on media indicators and time format]
    A2 -- iOS --> A3[Detect phone language setting based on media indicators and system messages]
    A2 -- Android --> A3[Detect phone language setting based on media indicators and system messages]
    A3 -- German --> A4[Replace special characters and delete left-to-right marks and zero-width non-breaking spaces]
    A3 -- English --> A4[Replace special characters and delete left-to-right marks and zero-width non-breaking spaces]
    A4 --> A5[Parse each message into a DateTime, Sender, and Message column]
    A5 --> B1[Timestamp]
    A5 --> B2[Sender]
    A5 --> B3[Message]
    B3 --> B4[Distinguish user-generated messages from system messages]
    B4 --> C1[System Messages]
    B4 --> D1[Message Text]
    D1 --> A7[Detecting self-deleting photos]
    A7 --> A8[Extracting message features from message column]
    A8 --> A9[Emoji]
    A9 --> A20[EmojiDescriptions]
    A8 --> A10[Smilies]
    A8 --> A11[URLs]
    A8 --> A12[Media Files]
    B1 --> A13[Time Order]
    B1 --> A14[Display Order]
    A8 --> A16[Locations]
    A8 --> A17[Flat message]
    A19 --> A18[Token Count]
    A17 --> A19[Tokenized Version]

    %% Define a class for the colored nodes
    classDef colored fill:#a3d9ff, stroke:#333, stroke-width:2px;

    %% Apply the class to nodes B1, B2, B3, C1, and A9-A19
    class D1,A20,B1,B2,B3,C1,A9,A10,A11,A12,A13,A14,A15,A16,A17,A18,A19 colored;
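The cleaning step in the flowchart can be sketched in a few lines. The example assumes that the left-to-right mark is U+200E and the zero-width non-breaking space is U+FEFF; the function name strip_invisible is made up for this illustration and is not part of WhatsR.

```r
# Hypothetical helper: remove invisible formatting characters
strip_invisible <- function(x) {
  # U+200E: left-to-right mark, U+FEFF: zero-width non-breaking space
  for (ch in c("\u200e", "\ufeff")) {
    x <- gsub(ch, "", x, fixed = TRUE)
  }
  x
}

# File names in exported chats are often preceded by an invisible mark
dirty <- "\u200eIMG-20231223-WA0000.jpg"
strip_invisible(dirty) == "IMG-20231223-WA0000.jpg"
```

Without this step, string comparisons and regular expressions can fail on text that looks identical on screen.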

To do this parsing by hand, we would need to:

  • build our own list of regular expressions to detect WhatsApp system messages
  • create regular expressions to detect all varieties of timestamps
  • create regular expressions to detect media files
  • do all of this for every combination of operating system, language, and time setting, and for exports with and without media files

Luckily, the WhatsR package already includes all of this and does it as automatically as possible! Let's use it to parse a chat log file!

# opening documentation
??parse_chat
# Running example:
data <- parse_chat(
  path = system.file("englishandroid24h.txt", package = "WhatsR"),
  anonymize = FALSE
  )
colnames(data)
 [1] "DateTime"          "Sender"            "Message"          
 [4] "Flat"              "TokVec"            "URL"              
 [7] "Media"             "Location"          "Emoji"            
[10] "EmojiDescriptions" "Smilies"           "SystemMessage"    
[13] "TokCount"          "TimeOrder"         "DisplayOrder"     
dim(data)
[1] 50 15
head(data)
DateTime | Sender | Message | Flat | TokVec | URL | Media | Location | Emoji | EmojiDescriptions | Smilies | SystemMessage | TokCount | TimeOrder | DisplayOrder
2018-01-29 12:24:00 | WhatsApp System Message | NA | NA | NA | NA | NA | NA | NA | NA | NA | Messages and calls are end-to-end encrypted. No one outside of this chat, not even WhatsApp, can read or listen to them. Tap to learn more. | NA | 1 | 1
2018-01-29 12:24:00 | WhatsApp System Message | NA | NA | NA | NA | NA | NA | NA | NA | NA | You created group “WhatsAppParserTest🙈” | NA | 2 | 2
2018-01-29 12:24:00 | WhatsApp System Message | NA | NA | NA | NA | NA | NA | NA | NA | NA | You added Mallory | NA | 3 | 3
2018-01-29 12:24:00 | Mallory | Hey! :) | Hey | hey | NA | NA | NA | NA | NA | :) | NA | 1 | 4 | 4
2018-01-29 12:24:00 | WhatsApp System Message | NA | NA | NA | NA | NA | NA | NA | NA | NA | You removed Mallory | NA | 5 | 5
2018-01-29 12:25:00 | WhatsApp System Message | NA | NA | NA | NA | NA | NA | NA | NA | NA | You added Alice | NA | 7 | 6

If you’re using the GitHub dev version of WhatsR, the parsed dataframe also contains the inferred phone language, detected operating system, and time the exported chat log was parsed as attributes. The CRAN version of the package does not contain this feature yet.

attributes(data)$language
[1] "english"
attributes(data)$detectedOS
[1] "android"
as.POSIXct(attributes(data)$parsedAt)
[1] "2025-03-24 23:54:34 CET"
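Since these attributes only exist in the development version, code that should run with both versions can check for them first. This is a minimal sketch: the attribute names are the ones shown above, while the helper name get_parse_meta and the toy data frame are made up for illustration.

```r
# Hypothetical helper: return the parse metadata that is actually present
get_parse_meta <- function(df) {
  keys <- c("language", "detectedOS", "parsedAt")
  meta <- lapply(keys, function(k) attr(df, k, exact = TRUE))
  names(meta) <- keys
  # Drop attributes that are missing (e.g. with the CRAN version)
  meta[!vapply(meta, is.null, logical(1))]
}

# Toy data frame standing in for a parsed chat
toy <- data.frame(Sender = "Alice", Message = "Hi", stringsAsFactors = FALSE)
attr(toy, "language") <- "english"
attr(toy, "detectedOS") <- "android"

get_parse_meta(toy)
```

An empty list as result indicates that the data frame carries none of these attributes.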




Explore your own data

You can now use the parse_chat function to parse the example chat files or your own chat files. You can also compare your own chat logs and the provided example chat logs:

  • Is everything parsed correctly?
  • Does your data contain all variables?
  • Is there anything that is going wrong?
  • Is there anything you are missing?
Important

The structure of WhatsApp chat log files changes several times per year, unannounced and unpredictably. If the parser does not work for your exported chat file, the structure has probably changed again since the last update of the package. Please open an issue on the package's GitHub repository so I can try to update the parser accordingly.

Parsing Data

# TODO: replace the path variable with the path to your txt file
data <- parse_chat("PATH_TO_YOUR_FILE", anonymize = FALSE)
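After parsing, a quick way to check whether your data contains all variables is to compare its column names against the fifteen columns shown in the example output above. The helper name missing_columns and the toy data frame are hypothetical. The expected column list is copied from that output and may differ between package versions.

```r
# Column names as shown in the example output above
expected_cols <- c(
  "DateTime", "Sender", "Message", "Flat", "TokVec", "URL", "Media",
  "Location", "Emoji", "EmojiDescriptions", "Smilies", "SystemMessage",
  "TokCount", "TimeOrder", "DisplayOrder"
)

# Hypothetical helper: which expected columns are absent?
missing_columns <- function(df) {
  setdiff(expected_cols, colnames(df))
}

# Toy data frame standing in for an incomplete parse result
toy <- data.frame(DateTime = Sys.time(), Sender = "Alice", Message = "Hi",
                  stringsAsFactors = FALSE)
missing_columns(toy)
```

For a complete parse result, missing_columns() returns an empty character vector.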

Summarizing parsed Data

dim(data)
[1] 50 15
attributes(data)$language
[1] "english"
attributes(data)$detectedOS
[1] "android"
as.POSIXct(attributes(data)$parsedAt)
[1] "2025-03-24 23:54:34 CET"
??summarize_chat
summarize_chat(data, exclude_sm = TRUE)
$NumberOfMessages
[1] 42

$NumberOfTokens
[1] 380

$NumberOfParticipants
[1] 5

$StartDate
[1] "2018-01-29 12:24:00 UTC"

$EndDate
[1] "2018-01-30 00:23:00 UTC"

$TimeSpan
Time difference of 11.98333 hours

$NumberOfSystemMessages
[1] 0

$NumberOfEmoji
[1] 3

$NumberOfSmilies
[1] 5

$NumberOfLinks
[1] 4

$NumberOfMedia
[1] 1

$NumberOfLocation
[1] 2
??summarize_tokens_per_person
summarize_tokens_per_person(data, exclude_sm = TRUE)
$Mallory
$Mallory$Timespan
$Mallory$Timespan$Start
[1] "2018-01-29 12:24:00 UTC"

$Mallory$Timespan$End
[1] "2018-01-30 00:23:00 UTC"


$Mallory$TokenStats
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
      1       1       1       1       1       1 


$Dave
$Dave$Timespan
$Dave$Timespan$Start
[1] "2018-01-29 12:24:00 UTC"

$Dave$Timespan$End
[1] "2018-01-30 00:23:00 UTC"


$Dave$TokenStats
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
    2.0     2.0     2.5     3.5     4.0     7.0 


$Alice
$Alice$Timespan
$Alice$Timespan$Start
[1] "2018-01-29 12:24:00 UTC"

$Alice$Timespan$End
[1] "2018-01-30 00:23:00 UTC"


$Alice$TokenStats
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
    1.0     4.0     4.5     9.8     7.5    56.0       3 


$Bob
$Bob$Timespan
$Bob$Timespan$Start
[1] "2018-01-29 12:24:00 UTC"

$Bob$Timespan$End
[1] "2018-01-30 00:23:00 UTC"


$Bob$TokenStats
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
   2.00    3.00    6.00   11.12   13.00   56.00       2 


$Carol
$Carol$Timespan
$Carol$Timespan$Start
[1] "2018-01-29 12:24:00 UTC"

$Carol$Timespan$End
[1] "2018-01-30 00:23:00 UTC"


$Carol$TokenStats
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
   4.00    4.75    9.00   19.50   23.75   56.00       1 
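Because the parsed chat is an ordinary data frame, similar summaries can also be computed with base R. The sketch below uses a toy data frame with the Sender and TokCount columns and the "WhatsApp System Message" sender label shown in the output above. It does not reproduce what the WhatsR summary functions do internally.

```r
# Toy data frame standing in for a parsed chat
toy <- data.frame(
  Sender   = c("Alice", "Bob", "Alice", "WhatsApp System Message"),
  TokCount = c(4, 2, NA, NA),
  stringsAsFactors = FALSE
)

# Keep only user-generated messages
user_msgs <- toy[toy$Sender != "WhatsApp System Message", ]

# Number of messages and total tokens per sender
messages_per_sender <- table(user_msgs$Sender)
tokens_per_sender <- tapply(user_msgs$TokCount, user_msgs$Sender,
                            sum, na.rm = TRUE)

messages_per_sender
tokens_per_sender
```

Replacing toy with your own parsed data frame should give the same kind of overview, as long as the column names match.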

Visualizing your own Data

The WhatsR package has a variety of functions to visualize your data. You can check some examples here and check out your own dataset!

# Overview of all available functions
??WhatsR

Visualizing Messaging

# checking documentation
??plot_messages

# Plotting amount of messages
p1 <- plot_messages(data, plot = "bar", exclude_sm = TRUE)

p2 <- plot_messages(data, plot = "cumsum", exclude_sm = TRUE)

p3 <- plot_messages(data, plot = "heatmap", exclude_sm = TRUE)

p4 <- plot_messages(data, plot = "pie", exclude_sm = TRUE)

# checking documentation
??plot_tokens

# Plotting amount of tokens
p5 <- plot_tokens(data, plot = "bar", exclude_sm = TRUE);p5

p6 <- plot_tokens(data, plot = "box", exclude_sm = TRUE);p6

p7 <- plot_tokens(data, plot = "violin", exclude_sm = TRUE);p7

p8 <- plot_tokens(data, plot = "cumsum", exclude_sm = TRUE);p8

# checking documentation
??plot_tokens_over_time

# Plotting amount of tokens over time
p9 <- plot_tokens_over_time(data,
                            plot = "year",
                            exclude_sm = TRUE)

p10 <- plot_tokens_over_time(data,
                             plot = "day",
                             exclude_sm = TRUE)

p11 <- plot_tokens_over_time(data,
                             plot = "hour",
                             exclude_sm = TRUE)

p12 <- plot_tokens_over_time(data,
                             plot = "heatmap",
                             exclude_sm = TRUE)

p13 <- plot_tokens_over_time(data,
                             plot = "alltime",
                             exclude_sm = TRUE)

Plotting sent Smilies

# documentation
??plot_smilies

# Plotting amount of smilies
p18 <- plot_smilies(data, plot = "bar", exclude_sm = TRUE)

p19 <- plot_smilies(data, plot = "splitbar", exclude_sm = TRUE)

p20 <- plot_smilies(data, plot = "heatmap", exclude_sm = TRUE);p20

p21 <- plot_smilies(data, plot = "cumsum", exclude_sm = TRUE)

Plotting Sent Emoji

# checking documentation
??plot_emoji

# Plotting amount of emoji
p22 <- plot_emoji(data,
 plot = "bar",
 min_occur = 1,
 exclude_sm = TRUE,
 emoji_size = 5)

p23 <- plot_emoji(data,
 plot = "splitbar",
 min_occur = 1,
 exclude_sm = TRUE,
 emoji_size = 5);p23

p24 <- plot_emoji(data,
 plot = "heatmap",
 min_occur = 1,
 exclude_sm = TRUE);p24

p25 <- plot_emoji(data,
 plot = "cumsum",
 min_occur = 1,
 exclude_sm = TRUE)

Checking Reaction Times

# check documentation
??plot_replytimes

# Plotting distribution of reaction times
p26 <- plot_replytimes(data,
                       type = "replytime",
                       exclude_sm = TRUE)

p27 <- plot_replytimes(data,
                       type = "reactiontime",
                       exclude_sm = TRUE)

Checking Lexical Dispersion

# checking documentation
??plot_lexical_dispersion

# Plotting lexical dispersion 
# TODO: change the keyword ("problem") to a word contained in your chat
dispersion_plot <- plot_lexical_dispersion(data,
                                           keywords = c("problem"),
                                           exclude_sm = TRUE)
# printing the plot
dispersion_plot

Plotting Network of Replies

# checking documentation
??plot_network

# Plotting response network
plot_network(data,
             edgetype = "n",
             collapse_sessions = TRUE,
             exclude_sm = TRUE)

Plotting Wordclouds

# checking documentation
??plot_wordcloud

# plotting wordcloud
wordcloud <- plot_wordcloud(data, exclude_sm = TRUE)

split_wordcloud <- plot_wordcloud(data,
                                  exclude_sm = TRUE,
                                  comparison = TRUE,
                                  min_occur = 3)